Implementing Shared Memory on Mesh-Connected Computers and on the Fat-Tree
Abstract
We present deterministic upper and lower bounds on the slowdown required to simulate an $(n, m)$-PRAM on a variety of networks. The upper bounds are based on a novel scheme that exploits the splitting and combining of messages. This scheme can be implemented on an $n$-node $d$-dimensional mesh (for constant $d$) and on an $n$-leaf pruned butterfly, and it attains the smallest worst-case slowdown to date for such interconnections, namely $O\bigl(n^{1/d}(\log(m/n))^{1-1/d}\bigr)$ for the $d$-dimensional mesh (with constant $d$) and $O\bigl(\sqrt{n}\,\log(m/n)\bigr)$ for the pruned butterfly. In fact, the simulation on the pruned butterfly is the first PRAM simulation scheme on an area-universal network. Finally, we prove restricted and unrestricted lower bounds on the slowdown of any deterministic PRAM simulation on an arbitrary network, formulated in terms of the bandwidth properties of the interconnection as expressed by its decomposition tree.
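To get a feel for how the two upper bounds in the abstract scale, the following minimal sketch evaluates only their growth terms. It is not part of the paper: the function names are hypothetical, the constants hidden by the big-O notation are ignored, and base-2 logarithms are an assumption.

```python
import math


def mesh_slowdown_term(n: int, m: int, d: int) -> float:
    """Growth term n^(1/d) * (log2(m/n))^(1 - 1/d) for a d-dimensional mesh.

    Hypothetical helper: constants hidden by big-O are ignored and the
    logarithm base (2) is an assumption made for illustration.
    """
    if n <= 0 or m <= n or d < 1:
        raise ValueError("expected m > n > 0 and d >= 1")
    return n ** (1 / d) * math.log2(m / n) ** (1 - 1 / d)


def pruned_butterfly_slowdown_term(n: int, m: int) -> float:
    """Growth term sqrt(n) * log2(m/n) for an n-leaf pruned butterfly.

    Hypothetical helper under the same assumptions as above.
    """
    if n <= 0 or m <= n:
        raise ValueError("expected m > n > 0")
    return math.sqrt(n) * math.log2(m / n)


if __name__ == "__main__":
    n, m = 1024, 1024 * 16  # 1024 processors, 16 shared variables per processor
    print("2-D mesh term:        ", mesh_slowdown_term(n, m, 2))
    print("pruned butterfly term:", pruned_butterfly_slowdown_term(n, m))
```

For $d = 2$ the mesh term reduces to $\sqrt{n}\,\sqrt{\log(m/n)}$, so the two expressions differ only in the power of the logarithmic factor.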
Related Papers
Implementing Load-Balanced Switches With Fat-Tree Networks
Load-balanced switches have received a lot of attention lately as they are much more scalable than other existing switch architectures in the literature. One of the most salient features of load-balanced switches is the simplicity of implementing deterministic and periodic connection patterns for their switch fabrics. In this paper, we propose to use fat-tree networks as the switch fabrics in loa...
A False-Sharing Free Distributed Shared Memory Management Scheme
Distributed shared memory (DSM) systems on top of networks of workstations are especially vulnerable to the impact of false sharing because of their higher memory transaction overheads and thus higher false sharing penalties. In this paper we develop a dynamic-granularity shared memory management scheme that eliminates false sharing without sacrificing the transparency to conventional shared-mem...
Constructive, Deterministic Implementation of Shared Memory on Meshes
This paper describes a scheme to implement a shared address space of size m on an n-node mesh, with m polynomial in n, where each mesh node hosts a processor and a memory module. At the core of the simulation is a Hierarchical Memory Organization Scheme (HMOS), which governs the distribution of the shared variables, each replicated into multiple copies, among the memory modules, through a casca...
FFTs on Mesh Connected Computers
The most effective use of mesh connected computers is achieved by paying careful attention to the organization of the storage and movement of data. For an important class of algorithms, formalizing the different operations they perform leads to a unified treatment and may result in important simplifications. In this work we apply this point of view to the Fast Fourier Transform (F...
Simulating Shared Memory in Real Time: On the Computation Power of Reconfigurable Architectures
We consider randomized simulations of shared memory on a distributed memory machine (DMM) where the n processors and the n memory modules of the DMM are connected via a reconfigurable architecture. We first present a randomized simulation of a CRCW PRAM on a reconfigurable DMM having a complete reconfigurable interconnection. It guarantees delay $O(\log^* n)$, with high probability. Next we study a...
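The delay bound quoted in the abstract above uses the iterated logarithm, which grows extremely slowly. As a minimal sketch (the function name is hypothetical and the base-2 convention is an assumption), it can be computed by counting how many times the logarithm must be applied before the value drops to 1 or below:

```python
import math


def log_star(n: float) -> int:
    """Iterated logarithm: number of times log2 is applied until n <= 1.

    Hypothetical helper; base 2 is assumed for illustration.
    """
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count


if __name__ == "__main__":
    for value in (2, 16, 65536, 2 ** 65536):
        print(log_star(value))
```

Even for an input as large as $2^{65536}$ the result is only 5, which is why a delay of this order is close to constant in practice.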
Journal:
- Inf. Comput.
Volume 165
Pages -
Publication date 2001